Task Scheduling in Big Data - Review, Research Challenges, and Prospects

Authors

  • Kannan Govindarajan
  • Supun Kamburugamuve
  • Pulasthi Wickramasinghe
  • Geoffrey Fox
Abstract

In Big Data computing, processing data requires large amounts of CPU cycles, network bandwidth, and disk I/O. Dataflow is a programming model for processing Big Data in which tasks are organized in a graph structure. Scheduling these tasks, that is, placing them on the available resources, is a key active research area. Tasks must be scheduled effectively, in a manner that minimizes task completion time and increases resource utilization. In recent years, researchers have discussed and presented different task scheduling algorithms. In this research study, we investigate the state of the art of various task scheduling algorithms, scheduling considerations for batch and streaming processing, and the task scheduling algorithms used in well-known open-source Big Data platforms. Furthermore, this study proposes a new task scheduling system to alleviate the problems that persist in existing task scheduling for Big Data.

Keywords: Big Data, MapReduce, Dataflow, Task Scheduling Model, Twister2, Static and Dynamic Task Scheduling.
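The idea of placing graph-structured tasks on available resources can be illustrated with a minimal sketch. It is not the scheduler proposed in the paper, nor any platform's actual algorithm: the function names, the task costs, and the worker names (`w0`, `w1`) are hypothetical, and the policy shown is a simple greedy static placement that assigns each task, in dependency order, to the currently least-loaded worker.

```python
from collections import deque


def topological_order(graph):
    """Return tasks in dependency order using Kahn's algorithm.

    `graph` maps each task name to the list of its downstream tasks.
    """
    indegree = {task: 0 for task in graph}
    for downstream in graph.values():
        for child in downstream:
            indegree[child] += 1

    # Tasks with no upstream dependencies can be placed first.
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in graph[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(graph):
        raise ValueError("task graph contains a cycle")
    return order


def schedule(graph, cost, workers):
    """Greedy static placement (an illustrative assumption, not the paper's method).

    Each task goes to the worker with the smallest accumulated cost so far.
    """
    load = {w: 0 for w in workers}
    placement = {}
    for task in topological_order(graph):
        target = min(workers, key=lambda w: load[w])
        placement[task] = target
        load[target] += cost[task]
    return placement, load


# Hypothetical dataflow graph: one source, two parallel maps, one reduce.
graph = {
    "source": ["map_a", "map_b"],
    "map_a": ["reduce"],
    "map_b": ["reduce"],
    "reduce": [],
}
cost = {"source": 2, "map_a": 4, "map_b": 3, "reduce": 1}

placement, load = schedule(graph, cost, ["w0", "w1"])
print(placement)
print(load)
```

A static policy like this fixes the placement before execution starts; a dynamic scheduler would instead revisit placement decisions at run time as load and data arrival change.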

Similar Resources

Task Scheduling in Fog Computing: A Survey

Recently, fog computing has been introduced to solve the challenges of cloud computing regarding Internet objects. One of the challenges in the field of fog computing is the scheduling of tasks requested by Internet objects. In this study, a review of articles related to task scheduling in fog computing has been done. At first, the research questions and goals will be introduced, an...

Integrated modeling and solving the resource allocation problem and task scheduling in the cloud computing environment

Cloud computing is considered to be a new service provider technology for users and businesses. However, the cloud environment is facing a number of challenges. Resource allocation in a way that is optimum for users and cloud providers is difficult because of lack of data sharing between them. On the other hand, job scheduling is a basic issue and at the same time a big challenge in reaching hi...

Small Hydro-Power Plants in Kenya: A Review of Status, Challenges and Future Prospects

Small Hydro-power Plants (SHP) are an important source of electricity in many countries. However, little is known about SHP in Kenya. This paper reviews the status, challenges in implementation of SHP and prospects for future development of SHP in Kenya. The paper shows that SHP has not yet fully utilized the available hydro-power potential. The challenges associated with SHP development should...

Big data preprocessing: methods and prospects

The massive growth in the scale of data has been observed in recent years being a key factor of the Big Data scenario. Big Data can be defined as high volume, velocity and variety of data that require a new high-performance processing. Addressing big data is a challenging and time-demanding task that requires a large computational infrastructure to ensure successful data processing and analysis...

A Review of Data Intensive Computing

Data intensive computing is a common research problem in science, industry and computer academia. In recent twenty years, the explosive growth of science data has appeared all over the world. Typical data intensive computing applications include Internet text data processing, scientific research data processing, large scale graph computing, inverse and perspective problems. Data intensive compu...

Publication date: 2017